Corpus-Based Analyses of Adjectives: Automatic Clustering
Abstract
Similarity analysis is an important issue in both corpus-based research and language use. This paper focuses on the semantic usage of adjectives and analyzes the similarities among them. An adjective and the semantic tag of the head noun that it modifies in a noun phrase form a co-occurrence. A two-stage algorithm is applied to cluster the adjectives according to these co-occurrence relationships. Experimental results show that the approach balances the two concerns of large-scale data clustering and meaningful clustering. Paper Category: Topical Paper. Topic Area: Corpus Linguistics, Similarity Analysis, Clustering.
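As a rough illustration of clustering adjectives by their co-occurrence with head-noun semantic tags, the sketch below counts (adjective, tag) pairs, compares adjectives by cosine similarity, and groups them in two passes. The abstract does not describe what the two stages actually are, so the coarse partition by dominant tag, the greedy refinement, the `threshold` value, the function names, and the toy tags are all assumptions made for this example, not the authors' method.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(pairs):
    """Count how often each adjective modifies a head noun carrying each semantic tag."""
    vectors = defaultdict(Counter)
    for adjective, tag in pairs:
        vectors[adjective][tag] += 1
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def two_stage_cluster(vectors, threshold=0.8):
    """Hypothetical two-stage scheme (an assumption, not taken from the paper)."""
    # Stage 1: cheap coarse partition by each adjective's most frequent tag.
    # Ties are broken alphabetically so the result is deterministic.
    coarse = defaultdict(list)
    for adjective in sorted(vectors):
        counts = vectors[adjective]
        top_tag = max(sorted(counts), key=counts.get)
        coarse[top_tag].append(adjective)

    # Stage 2: inside each coarse group, greedily attach an adjective to the
    # first cluster whose seed (first member) is similar enough.
    clusters = []
    for tag in sorted(coarse):
        local = []
        for adjective in coarse[tag]:
            for cluster in local:
                if cosine(vectors[adjective], vectors[cluster[0]]) >= threshold:
                    cluster.append(adjective)
                    break
            else:
                local.append([adjective])
        clusters.extend(local)
    return clusters


# Toy (adjective, semantic tag of head noun) pairs; the tags are invented.
pairs = (
    [("red", "ARTIFACT")] * 2 + [("red", "PLANT")]
    + [("green", "ARTIFACT")] * 3 + [("green", "PLANT")]
    + [("happy", "PERSON")] * 2
    + [("sad", "PERSON")] * 2 + [("sad", "EVENT")]
    + [("wooden", "ARTIFACT")] + [("wooden", "BUILDING")] * 3
)
clusters = two_stage_cluster(cooccurrence_vectors(pairs))
print(clusters)
```

The coarse first pass keeps the number of pairwise comparisons small on large data, while the second pass decides the finer grouping; this is one plausible way to trade scalability against cluster quality, offered only as a sketch.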
Similar Resources
Towards the Automatic Identification of Adjectival Scales: Clustering Adjectives According to Meaning
In this paper we present a method to group adjectives according to their meaning, as a first step towards the automatic identification of adjectival scales. We discuss the properties of adjectival scales and of groups of semantically related adjectives and how they imply sources of linguistic knowledge in text corpora. We describe how our system exploits this linguistic knowledge to compute a m...
Augmenting Lexicons Automatically: Clustering Semantically Related Adjectives
Our work focuses on identifying various types of lexical data in large corpora through statistical analysis. In this paper, we present a method for grouping adjectives according to their meaning, as a step towards the automatic identification of adjectival scales. We describe how our system exploits two sources of linguistic knowledge in a corpus to compute a measure of similarity between two a...
Semantic Clustering of Adjectives and Verbs Using their Syntactic Patterns
The idea that the syntactic behaviour of words is connected with their meaning has been the assumption behind research in fields such as lexical semantics and automatic clustering of words based on statistical methods. In particular much work has been done to describe the relation between the semantic characteristics of verbs and their syntactic patterns, among many Fillmore (1970) and Levin (199...
Adjective Density as a Text Formality Characteristic for Automatic Text Classification: A Study Based on the British National Corpus
In this article, we report significant findings resulting from an investigation into the correlation between adjective density, calculated as the proportion of adjectives in word tokens, and degrees of text formality as part of an attempt to examine the potential application of adjectives in automatic text classification and identification. Correlations obtained from the training corpus will be...
Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space
We propose an approach to adjective-noun composition (AN) for corpus-based distributional semantics that, building on insights from theoretical linguistics, represents nouns as vectors and adjectives as data-induced (linear) functions (encoded as matrices) over nominal vectors. Our model significantly outperforms the rivals on the task of reconstructing AN vectors not seen in training. A small ...